Experiments on Optimizing the Performance of Stencil Codes with SPL Conqueror
Abstract
A standard technique for numerically solving elliptic partial differential equations on structured grids is to discretize them and then apply an efficient geometric multi-grid solver. Unfortunately, finding the optimal choice of multi-grid components and parameter settings is challenging, and existing auto-tuning techniques fail to explain performance-optimal settings. To improve the state of the art, we explore whether recent work on optimizing configurations of product lines can be applied to the stencil-code domain. In particular, we extend the domain-independent tool SPL Conqueror in an empirical study to predict the performance-optimal configurations of three geometric multi-grid stencil codes: a program using HIPAcc, the evaluation prototype HSMGP, and a program using DUNE. For HIPAcc, we reach a prediction accuracy of 96 %, on average, measuring only 21.4 % of all configurations; we predict a configuration that is nearly optimal after measuring less than 0.3 % of all configurations. For HSMGP, we predict performance with an accuracy of 97 %, including the performance-optimal configuration, while measuring 3.2 % of all configurations. For DUNE, we predict the performance of all configurations with an accuracy of 86 % after measuring 3.3 % of all configurations. The performance-optimal configuration is within the 0.5 % of configurations predicted to perform best.
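The idea of measuring only a small sample of configurations and predicting the rest can be illustrated with a minimal sketch. It is not SPL Conqueror's implementation: the option names, the synthetic `measure` function, and the purely additive model are assumptions made for illustration, and the sampling strategy (the base configuration plus one run per enabled option) is just one simple choice.

```python
from itertools import product

# Hypothetical binary options of a multi-grid solver (illustrative names only).
OPTIONS = ["red_black_smoother", "extra_pre_smoothing", "vectorized", "temporal_blocking"]


def measure(config):
    """Stand-in for a real benchmark run; returns a synthetic runtime in ms."""
    t = 100.0
    if config["red_black_smoother"]:
        t -= 15.0
    if config["extra_pre_smoothing"]:
        t += 8.0
    if config["vectorized"]:
        t -= 30.0
    if config["temporal_blocking"]:
        t -= 10.0
    # One interaction between two options that an additive model cannot capture.
    if config["vectorized"] and config["temporal_blocking"]:
        t -= 5.0
    return t


def fit_additive_model(options, measure_fn):
    """Measure the base configuration and each option enabled on its own."""
    base = {o: False for o in options}
    base_time = measure_fn(base)
    deltas = {o: measure_fn({**base, o: True}) - base_time for o in options}
    return base_time, deltas


def predict(model, config):
    """Predicted runtime: base time plus the delta of every enabled option."""
    base_time, deltas = model
    return base_time + sum(d for o, d in deltas.items() if config[o])


def all_configs(options):
    """Enumerate every combination of the binary options."""
    for bits in product([False, True], repeat=len(options)):
        yield dict(zip(options, bits))


def mean_accuracy(model, configs, measure_fn):
    """One minus the mean relative prediction error over the given configurations."""
    errors = [abs(predict(model, c) - measure_fn(c)) / measure_fn(c) for c in configs]
    return 1.0 - sum(errors) / len(errors)


model = fit_additive_model(OPTIONS, measure)
configs = list(all_configs(OPTIONS))
accuracy = mean_accuracy(model, configs, measure)
predicted_best = min(configs, key=lambda c: predict(model, c))
true_best = min(configs, key=measure)
measured_fraction = (len(OPTIONS) + 1) / len(configs)

print(f"measured {measured_fraction:.1%} of configurations, accuracy {accuracy:.1%}")
print("predicted best equals true best:", predicted_best == true_best)
```

In this toy setting the accuracy is computed against exhaustive measurements, which is only feasible because the configuration space is tiny. The unmodeled interaction is what keeps the accuracy below 100 %.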
Similar resources
Optimizing Performance of Stencil Code with SPL Conqueror
A standard technique to numerically solve elliptic partial differential equations on structured grids is to discretize them via finite differences and then to apply an efficient geometric multi-grid solver. Unfortunately, finding the optimal choice of multi-grid components and parameters is challenging and platform dependent, especially, in cases where domain knowledge is incomplete. Auto-tunin...
Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model
Our aim is to apply program transformations to stencil codes in order to yield the highest possible performance. We recognize memory bandwidth as a major limitation in stencil code performance. We conducted a study in which we applied optimizing transformations to two Jacobi smoother kernels: one 3D 1st-order 7-point stencil and one 3D 3rd-order 19-point stencil. To obtain high performance, the...
Optimizing Transformations of Stencil Operations for Parallel Cache-based Architectures
This paper describes a new technique for optimizing serial and parallel stencil and stencil-like operations for cache-based architectures. This technique takes advantage of the semantic knowledge implicit in stencil-like computations. The technique is implemented as a source-to-source program transformation; because of its specificity it could not be expected of a conventional compiler. Empiri...
Guest Editors' Note: Special Issue On High-Performance Stencil Computations
This workshop is the first in a new series of workshops intended to address current and upcoming challenges and developments in the area of stencil computations. Today, real-world stencil codes are often hand-tuned, which requires a huge amount of engineering effort given the variety of stencil codes in use. Therefore, simplifying the task of constructing stencil codes that deliver high performa...
Optimizing Stencil Codes Using Search
Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance. In this paper, we demonstrate one method for automatically generating high-quality stencil code. First, we perform a search over various instruction-level optimizations to find the best platform-specific combination. These op...
Journal title:
- Parallel Processing Letters
Volume 24
Publication date: 2014